Scalable data visualisation

Ben Lambert

Scalable data visualisation via grammar of graphics

Grammar of graphics

  • single framework that easily generalises to larger sets of variables with complex hierarchies
  • framework for a huge variety of graphics
  • agile exploration of data and creation of coherent graphics

Suicide rates: 1985 to 2016

country year sex age suicides_no population suicides.100k.pop country.year HDI.for.year gdp_for_year…. gdp_per_capita…. generation
Bahrain 1985 male 25-34 years 7 67600 10.36 Bahrain1985 0.727 3,651,861,702 9980 Boomers
Bahrain 1985 male 35-54 years 3 49700 6.04 Bahrain1985 0.727 3,651,861,702 9980 Silent
Bahrain 1985 female 35-54 years 1 26900 3.72 Bahrain1985 0.727 3,651,861,702 9980 Silent
Bahrain 1985 female 15-24 years 0 37800 0.00 Bahrain1985 0.727 3,651,861,702 9980 Generation X
Bahrain 1985 female 25-34 years 0 27600 0.00 Bahrain1985 0.727 3,651,861,702 9980 Boomers
Bahrain 1985 female 5-14 years 0 41400 0.00 Bahrain1985 0.727 3,651,861,702 9980 Generation X
Bahrain 1985 female 55-74 years 0 8700 0.00 Bahrain1985 0.727 3,651,861,702 9980 G.I. Generation
Bahrain 1985 female 75+ years 0 1500 0.00 Bahrain1985 0.727 3,651,861,702 9980 G.I. Generation
Bahrain 1985 male 15-24 years 0 49700 0.00 Bahrain1985 0.727 3,651,861,702 9980 Generation X
Bahrain 1985 male 5-14 years 0 42000 0.00 Bahrain1985 0.727 3,651,861,702 9980 Generation X
Bahrain 1985 male 55-74 years 0 11600 0.00 Bahrain1985 0.727 3,651,861,702 9980 G.I. Generation
Bahrain 1985 male 75+ years 0 1400 0.00 Bahrain1985 0.727 3,651,861,702 9980 G.I. Generation
Bahrain 1987 male 55-74 years 1 11800 8.47 Bahrain1987 NA 3,392,021,011 9321 G.I. Generation
Bahrain 1987 male 35-54 years 3 47700 6.29 Bahrain1987 NA 3,392,021,011 9321 Silent
Bahrain 1987 male 25-34 years 3 63900 4.69 Bahrain1987 NA 3,392,021,011 9321 Boomers
Bahrain 1987 female 25-34 years 1 27500 3.64 Bahrain1987 NA 3,392,021,011 9321 Boomers
Bahrain 1987 female 15-24 years 1 38600 2.59 Bahrain1987 NA 3,392,021,011 9321 Generation X
Bahrain 1987 female 35-54 years 0 27300 0.00 Bahrain1987 NA 3,392,021,011 9321 Silent
Bahrain 1987 female 5-14 years 0 42500 0.00 Bahrain1987 NA 3,392,021,011 9321 Generation X
Bahrain 1987 female 55-74 years 0 8900 0.00 Bahrain1987 NA 3,392,021,011 9321 G.I. Generation
Bahrain 1987 female 75+ years 0 1500 0.00 Bahrain1987 NA 3,392,021,011 9321 G.I. Generation
Bahrain 1987 male 15-24 years 0 49400 0.00 Bahrain1987 NA 3,392,021,011 9321 Generation X
Bahrain 1987 male 5-14 years 0 43200 0.00 Bahrain1987 NA 3,392,021,011 9321 Generation X
Bahrain 1987 male 75+ years 0 1600 0.00 Bahrain1987 NA 3,392,021,011 9321 G.I. Generation
Bahrain 1988 male 25-34 years 8 61600 12.99 Bahrain1988 NA 3,702,393,617 9501 Boomers
Bahrain 1988 male 35-54 years 4 63500 6.30 Bahrain1988 NA 3,702,393,617 9501 Silent
Bahrain 1988 female 35-54 years 1 30900 3.24 Bahrain1988 NA 3,702,393,617 9501 Silent
Bahrain 1988 male 15-24 years 1 44100 2.27 Bahrain1988 NA 3,702,393,617 9501 Generation X
Bahrain 1988 female 15-24 years 0 38300 0.00 Bahrain1988 NA 3,702,393,617 9501 Generation X
Bahrain 1988 female 25-34 years 0 32000 0.00 Bahrain1988 NA 3,702,393,617 9501 Boomers
Bahrain 1988 female 5-14 years 0 45400 0.00 Bahrain1988 NA 3,702,393,617 9501 Generation X
Bahrain 1988 female 55-74 years 0 10300 0.00 Bahrain1988 NA 3,702,393,617 9501 G.I. Generation
Bahrain 1988 female 75+ years 0 1400 0.00 Bahrain1988 NA 3,702,393,617 9501 G.I. Generation
Bahrain 1988 male 5-14 years 0 46500 0.00 Bahrain1988 NA 3,702,393,617 9501 Generation X
Bahrain 1988 male 55-74 years 0 14400 0.00 Bahrain1988 NA 3,702,393,617 9501 G.I. Generation
Bahrain 1988 male 75+ years 0 1300 0.00 Bahrain1988 NA 3,702,393,617 9501 G.I. Generation
Bahrain 1997 female 15-24 years 0 46573 0.00 Bahrain1997 NA 6,349,202,394 11985 Generation X
Bahrain 1997 female 25-34 years 0 50841 0.00 Bahrain1997 NA 6,349,202,394 11985 Generation X
Bahrain 1997 female 35-54 years 0 53562 0.00 Bahrain1997 NA 6,349,202,394 11985 Boomers
Bahrain 1997 female 5-14 years 0 55390 0.00 Bahrain1997 NA 6,349,202,394 11985 Millenials
Bahrain 1997 female 55-74 years 0 13826 0.00 Bahrain1997 NA 6,349,202,394 11985 Silent
Bahrain 1997 female 75+ years 0 2250 0.00 Bahrain1997 NA 6,349,202,394 11985 G.I. Generation
Bahrain 1997 male 15-24 years 0 54369 0.00 Bahrain1997 NA 6,349,202,394 11985 Generation X
Bahrain 1997 male 25-34 years 0 80297 0.00 Bahrain1997 NA 6,349,202,394 11985 Generation X
Bahrain 1997 male 35-54 years 0 96825 0.00 Bahrain1997 NA 6,349,202,394 11985 Boomers
Bahrain 1997 male 5-14 years 0 57431 0.00 Bahrain1997 NA 6,349,202,394 11985 Millenials
Bahrain 1997 male 55-74 years 0 16035 0.00 Bahrain1997 NA 6,349,202,394 11985 Silent
Bahrain 1997 male 75+ years 0 2355 0.00 Bahrain1997 NA 6,349,202,394 11985 G.I. Generation
Bahrain 1998 female 15-24 years 0 48147 0.00 Bahrain1998 NA 6,183,776,596 11325 Generation X
Bahrain 1998 female 25-34 years 0 51596 0.00 Bahrain1998 NA 6,183,776,596 11325 Generation X
Bahrain 1998 female 35-54 years 0 56917 0.00 Bahrain1998 NA 6,183,776,596 11325 Boomers
Bahrain 1998 female 5-14 years 0 56626 0.00 Bahrain1998 NA 6,183,776,596 11325 Millenials
Bahrain 1998 female 55-74 years 0 14404 0.00 Bahrain1998 NA 6,183,776,596 11325 Silent
Bahrain 1998 female 75+ years 0 2312 0.00 Bahrain1998 NA 6,183,776,596 11325 G.I. Generation
Bahrain 1998 male 15-24 years 0 56377 0.00 Bahrain1998 NA 6,183,776,596 11325 Generation X
Bahrain 1998 male 25-34 years 0 80849 0.00 Bahrain1998 NA 6,183,776,596 11325 Generation X
Bahrain 1998 male 35-54 years 0 101013 0.00 Bahrain1998 NA 6,183,776,596 11325 Boomers
Bahrain 1998 male 5-14 years 0 58935 0.00 Bahrain1998 NA 6,183,776,596 11325 Millenials
Bahrain 1998 male 55-74 years 0 16409 0.00 Bahrain1998 NA 6,183,776,596 11325 Silent
Bahrain 1998 male 75+ years 0 2423 0.00 Bahrain1998 NA 6,183,776,596 11325 G.I. Generation
Bahrain 1999 male 25-34 years 8 82791 9.66 Bahrain1999 NA 6,621,010,372 11703 Generation X
Bahrain 1999 male 35-54 years 4 105903 3.78 Bahrain1999 NA 6,621,010,372 11703 Boomers
Bahrain 1999 male 15-24 years 2 57914 3.45 Bahrain1999 NA 6,621,010,372 11703 Generation X
Bahrain 1999 female 15-24 years 1 49631 2.01 Bahrain1999 NA 6,621,010,372 11703 Generation X
Bahrain 1999 female 25-34 years 1 52339 1.91 Bahrain1999 NA 6,621,010,372 11703 Generation X
Bahrain 1999 female 35-54 years 1 60568 1.65 Bahrain1999 NA 6,621,010,372 11703 Boomers
Bahrain 1999 female 5-14 years 0 58374 0.00 Bahrain1999 NA 6,621,010,372 11703 Millenials
Bahrain 1999 female 55-74 years 0 15017 0.00 Bahrain1999 NA 6,621,010,372 11703 Silent
Bahrain 1999 female 75+ years 0 2339 0.00 Bahrain1999 NA 6,621,010,372 11703 G.I. Generation
Bahrain 1999 male 5-14 years 0 61554 0.00 Bahrain1999 NA 6,621,010,372 11703 Millenials
Bahrain 1999 male 55-74 years 0 16810 0.00 Bahrain1999 NA 6,621,010,372 11703 Silent
Bahrain 1999 male 75+ years 0 2522 0.00 Bahrain1999 NA 6,621,010,372 11703 G.I. Generation
Bahrain 2000 male 25-34 years 16 86231 18.55 Bahrain2000 0.794 9,062,906,915 15345 Generation X
Bahrain 2000 male 35-54 years 10 112060 8.92 Bahrain2000 0.794 9,062,906,915 15345 Boomers
Bahrain 2000 male 15-24 years 2 59208 3.38 Bahrain2000 0.794 9,062,906,915 15345 Generation X
Bahrain 2000 female 25-34 years 1 53263 1.88 Bahrain2000 0.794 9,062,906,915 15345 Generation X
Bahrain 2000 female 15-24 years 0 50968 0.00 Bahrain2000 0.794 9,062,906,915 15345 Generation X
Bahrain 2000 female 35-54 years 0 64617 0.00 Bahrain2000 0.794 9,062,906,915 15345 Boomers
Bahrain 2000 female 5-14 years 0 60791 0.00 Bahrain2000 0.794 9,062,906,915 15345 Millenials
Bahrain 2000 female 55-74 years 0 15556 0.00 Bahrain2000 0.794 9,062,906,915 15345 Silent
Bahrain 2000 female 75+ years 0 2394 0.00 Bahrain2000 0.794 9,062,906,915 15345 G.I. Generation
Bahrain 2000 male 5-14 years 0 65512 0.00 Bahrain2000 0.794 9,062,906,915 15345 Millenials
Bahrain 2000 male 55-74 years 0 17346 0.00 Bahrain2000 0.794 9,062,906,915 15345 Silent
Bahrain 2000 male 75+ years 0 2648 0.00 Bahrain2000 0.794 9,062,906,915 15345 G.I. Generation
Bahrain 2001 male 25-34 years 8 90920 8.80 Bahrain2001 NA 8,976,207,713 14383 Generation X
Bahrain 2001 male 35-54 years 8 118151 6.77 Bahrain2001 NA 8,976,207,713 14383 Boomers
Bahrain 2001 male 55-74 years 1 17917 5.58 Bahrain2001 NA 8,976,207,713 14383 Silent
Bahrain 2001 male 15-24 years 2 68309 2.93 Bahrain2001 NA 8,976,207,713 14383 Millenials
Bahrain 2001 female 15-24 years 1 54497 1.83 Bahrain2001 NA 8,976,207,713 14383 Millenials
Bahrain 2001 female 25-34 years 1 55961 1.79 Bahrain2001 NA 8,976,207,713 14383 Generation X
Bahrain 2001 male 5-14 years 1 67127 1.49 Bahrain2001 NA 8,976,207,713 14383 Millenials
Bahrain 2001 female 35-54 years 1 67856 1.47 Bahrain2001 NA 8,976,207,713 14383 Boomers
Bahrain 2001 female 5-14 years 0 62316 0.00 Bahrain2001 NA 8,976,207,713 14383 Millenials
Bahrain 2001 female 55-74 years 0 15791 0.00 Bahrain2001 NA 8,976,207,713 14383 Silent
Bahrain 2001 female 75+ years 0 2544 0.00 Bahrain2001 NA 8,976,207,713 14383 Silent
Bahrain 2001 male 75+ years 0 2694 0.00 Bahrain2001 NA 8,976,207,713 14383 Silent
Bahrain 2002 male 25-34 years 7 97288 7.20 Bahrain2002 NA 9,632,155,053 14572 Generation X
Bahrain 2002 male 35-54 years 9 126500 7.11 Bahrain2002 NA 9,632,155,053 14572 Boomers
Bahrain 2002 female 35-54 years 4 71376 5.60 Bahrain2002 NA 9,632,155,053 14572 Boomers
Bahrain 2002 female 15-24 years 2 57599 3.47 Bahrain2002 NA 9,632,155,053 14572 Millenials

Comparing the traditional and GG ways

Standard way

plot(df$population, df$suicides_no)

Plotting with ggplot / plotnine

ggplot(df, aes(x=population, y=suicides_no)) +
  geom_point()

How to colour points according to sex?

Colouring by sex the traditional way (uses “wide” format)

country year age population_male population_female suicides_no_male suicides_no_female
Bahrain 1985 25-34 years 67600 27600 7 0
Bahrain 1985 35-54 years 49700 26900 3 1
Bahrain 1985 15-24 years 49700 37800 0 0
Bahrain 1985 5-14 years 42000 41400 0 0
Bahrain 1985 55-74 years 11600 8700 0 0
Bahrain 1985 75+ years 1400 1500 0 0
Bahrain 1987 55-74 years 11800 8900 1 0
Bahrain 1987 35-54 years 47700 27300 3 0
Bahrain 1987 25-34 years 63900 27500 3 1
Bahrain 1987 15-24 years 49400 38600 0 1
Bahrain 1987 5-14 years 43200 42500 0 0
Bahrain 1987 75+ years 1600 1500 0 0
Bahrain 1988 25-34 years 61600 32000 8 0
Bahrain 1988 35-54 years 63500 30900 4 1
Bahrain 1988 15-24 years 44100 38300 1 0
Bahrain 1988 5-14 years 46500 45400 0 0
Bahrain 1988 55-74 years 14400 10300 0 0
Bahrain 1988 75+ years 1300 1400 0 0
Bahrain 1997 15-24 years 54369 46573 0 0
Bahrain 1997 25-34 years 80297 50841 0 0
Bahrain 1997 35-54 years 96825 53562 0 0
Bahrain 1997 5-14 years 57431 55390 0 0
Bahrain 1997 55-74 years 16035 13826 0 0
Bahrain 1997 75+ years 2355 2250 0 0
Bahrain 1998 15-24 years 56377 48147 0 0
Bahrain 1998 25-34 years 80849 51596 0 0
Bahrain 1998 35-54 years 101013 56917 0 0
Bahrain 1998 5-14 years 58935 56626 0 0
Bahrain 1998 55-74 years 16409 14404 0 0
Bahrain 1998 75+ years 2423 2312 0 0
Bahrain 1999 25-34 years 82791 52339 8 1
Bahrain 1999 35-54 years 105903 60568 4 1
Bahrain 1999 15-24 years 57914 49631 2 1
Bahrain 1999 5-14 years 61554 58374 0 0
Bahrain 1999 55-74 years 16810 15017 0 0
Bahrain 1999 75+ years 2522 2339 0 0
Bahrain 2000 25-34 years 86231 53263 16 1
Bahrain 2000 35-54 years 112060 64617 10 0
Bahrain 2000 15-24 years 59208 50968 2 0
Bahrain 2000 5-14 years 65512 60791 0 0
Bahrain 2000 55-74 years 17346 15556 0 0
Bahrain 2000 75+ years 2648 2394 0 0
Bahrain 2001 25-34 years 90920 55961 8 1
Bahrain 2001 35-54 years 118151 67856 8 1
Bahrain 2001 55-74 years 17917 15791 1 0
Bahrain 2001 15-24 years 68309 54497 2 1
Bahrain 2001 5-14 years 67127 62316 1 0
Bahrain 2001 75+ years 2694 2544 0 0
Bahrain 2002 25-34 years 97288 59350 7 1
Bahrain 2002 35-54 years 126500 71376 9 4
Bahrain 2002 15-24 years 76442 57599 2 2
Bahrain 2002 5-14 years 68596 63714 0 0
Bahrain 2002 55-74 years 18646 16006 0 0
Bahrain 2002 75+ years 2755 2738 0 0
Bahrain 2003 35-54 years 137287 75430 17 0
Bahrain 2003 25-34 years 105799 63380 9 5
Bahrain 2003 15-24 years 83143 60298 3 0
Bahrain 2003 5-14 years 70584 65298 0 0
Bahrain 2003 55-74 years 19574 16343 0 0
Bahrain 2003 75+ years 2831 2946 0 0
Bahrain 2004 55-74 years 20717 16975 4 0
Bahrain 2004 25-34 years 116777 68052 19 3
Bahrain 2004 35-54 years 150835 80288 10 3
Bahrain 2004 15-24 years 88431 62699 2 1
Bahrain 2004 5-14 years 73422 67313 0 0
Bahrain 2004 75+ years 2924 3126 0 0
Bahrain 2005 25-34 years 142375 73758 15 3
Bahrain 2005 35-54 years 174785 92102 14 2
Bahrain 2005 55-74 years 27261 19983 2 0
Bahrain 2005 15-24 years 79928 64792 4 2
Bahrain 2005 5-14 years 69742 66362 0 0
Bahrain 2005 75+ years 4063 4389 0 0
Bahrain 2006 25-34 years 158656 80271 12 5
Bahrain 2006 15-24 years 86090 69020 1 3
Bahrain 2006 55-74 years 29960 21363 1 0
Bahrain 2006 35-54 years 192597 100668 3 3
Bahrain 2006 5-14 years 71974 68590 0 0
Bahrain 2006 75+ years 4439 4915 0 0
Bahrain 2007 25-34 years 176938 87472 14 3
Bahrain 2007 35-54 years 212426 110173 13 2
Bahrain 2007 55-74 years 32965 22872 2 0
Bahrain 2007 15-24 years 92875 73617 5 1
Bahrain 2007 5-14 years 74222 70861 0 0
Bahrain 2007 75+ years 4852 5497 0 0
Bahrain 2008 25-34 years 196832 93449 21 2
Bahrain 2008 35-54 years 225866 115468 13 3
Bahrain 2008 15-24 years 96458 76246 0 3
Bahrain 2008 55-74 years 35879 24495 1 0
Bahrain 2008 5-14 years 76617 72955 0 0
Bahrain 2008 75+ years 4654 5191 0 0
Bahrain 2009 35-54 years 242558 118835 14 2
Bahrain 2009 25-34 years 222448 99580 9 4
Bahrain 2009 15-24 years 102256 80832 4 0
Bahrain 2009 55-74 years 38002 25915 1 0
Bahrain 2009 5-14 years 79180 75122 2 0
Bahrain 2009 75+ years 4091 4103 0 0
Bahrain 2010 35-54 years 252232 125709 8 2
Bahrain 2010 25-34 years 235591 104368 6 1
Bahrain 2010 55-74 years 41999 27834 1 0
Bahrain 2010 15-24 years 103388 80571 0 1

Traditional: plot data separately for each subpopulation

plot(df_wide$population_male, df_wide$suicides_no_male, col="red")
points(df_wide$population_female, df_wide$suicides_no_female, col="blue")

GG way uses long format

country year sex age suicides_no population suicides.100k.pop country.year HDI.for.year gdp_for_year…. gdp_per_capita…. generation
Bahrain 1985 male 25-34 years 7 67600 10.36 Bahrain1985 0.727 3,651,861,702 9980 Boomers
Bahrain 1985 male 35-54 years 3 49700 6.04 Bahrain1985 0.727 3,651,861,702 9980 Silent
Bahrain 1985 female 35-54 years 1 26900 3.72 Bahrain1985 0.727 3,651,861,702 9980 Silent
Bahrain 1985 female 15-24 years 0 37800 0.00 Bahrain1985 0.727 3,651,861,702 9980 Generation X

GG: associate an extra aesthetic (“colour”) with the data points

ggplot(df, aes(x=population, y=suicides_no, colour=sex)) +
  geom_point()
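Mapping colour to sex works because sex is its own column in the long table. If the data arrives in the wide layout shown earlier, it would first need reshaping. The following is a minimal sketch in plain Python (standard library only) of that reshaping; the function name `wide_to_long` and its parameters are made up for illustration, and in practice a reshaping helper from your data library would do this job.

```python
# Two rows copied from the wide table in this deck.
wide_rows = [
    {"country": "Bahrain", "year": 1985, "age": "25-34 years",
     "population_male": 67600, "population_female": 27600,
     "suicides_no_male": 7, "suicides_no_female": 0},
    {"country": "Bahrain", "year": 1985, "age": "35-54 years",
     "population_male": 49700, "population_female": 26900,
     "suicides_no_male": 3, "suicides_no_female": 1},
]


def wide_to_long(rows, measures=("population", "suicides_no"),
                 levels=("male", "female")):
    """Turn each wide row into one long row per level of `sex`."""
    paired = {f"{m}_{s}" for m in measures for s in levels}
    long_rows = []
    for row in rows:
        # Columns that are not measure_level pairs identify the observation.
        id_cols = {k: v for k, v in row.items() if k not in paired}
        for sex in levels:
            long_row = dict(id_cols, sex=sex)
            for m in measures:
                long_row[m] = row[f"{m}_{sex}"]
            long_rows.append(long_row)
    return long_rows


long_rows = wide_to_long(wide_rows)
for r in long_rows:
    print(r)
```

Each wide row becomes two long rows, one per sex, so a further subpopulation adds rows rather than columns.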

Agile data visualisation with GG

Plots operate on aesthetic mappings

aes(x=population, y=suicides_no, colour=sex)

  • is an example of an aesthetic mapping: it associates aesthetics with values for each of your data points
  • here it associates:
    • the horizontal position of data points with population
    • the vertical position with the suicide count
    • the colour of points with sex
  • many other aesthetics, such as shape and size, are possible
  • geoms (see later) control how aesthetics are displayed
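One way to picture an aesthetic mapping is as a lookup from aesthetic names to column names, applied to every row. The sketch below is plain Python and is not how ggplot2 or plotnine are implemented; `apply_mapping`, `discrete_colour_scale` and the palette are hypothetical names chosen for illustration.

```python
rows = [
    {"population": 67600, "suicides_no": 7, "sex": "male"},
    {"population": 26900, "suicides_no": 1, "sex": "female"},
    {"population": 49700, "suicides_no": 3, "sex": "male"},
]

# Names a column for each aesthetic, like aes(x=..., y=..., colour=...).
mapping = {"x": "population", "y": "suicides_no", "colour": "sex"}


def apply_mapping(rows, mapping):
    """Return one dict of aesthetic values per data point."""
    return [{aes: row[col] for aes, col in mapping.items()} for row in rows]


def discrete_colour_scale(values, palette=("red", "blue", "green")):
    """Assign one palette entry per distinct value, in sorted order."""
    levels = sorted(set(values))
    return {level: palette[i % len(palette)] for i, level in enumerate(levels)}


points = apply_mapping(rows, mapping)
scale = discrete_colour_scale(p["colour"] for p in points)

# What a point geom would finally draw: position plus a concrete colour.
drawn = [(p["x"], p["y"], scale[p["colour"]]) for p in points]
print(drawn)
```

Changing which variable drives the colour then means editing one entry of `mapping`, which is the source of the agility described above.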

Colour by country: traditionally annoying

ggplot(df, aes(x=population, y=suicides_no, colour=country)) +
  geom_point()

Shape by country: traditionally annoying

ggplot(df, aes(x=population, y=suicides_no, shape=country)) +
  geom_point()

Regressions by country

ggplot(df, aes(x=population, y=suicides_no, colour=country)) +
  geom_point(alpha=0.3) + geom_smooth(method="lm", se=FALSE)

Overall regression

ggplot(df, aes(x=population, y=suicides_no)) +
  geom_point(alpha=0.3, aes(colour=country)) +
  geom_smooth(method="lm", se=FALSE, colour="black")
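The difference between the two regression plots comes down to where colour is mapped. With colour=country in the top-level aes, the smooth inherits the grouping and a line is fitted per country; with colour mapped only inside geom_point, the smooth sees a single group. A rough sketch of the two computations in plain Python, using ordinary least squares and made-up points rather than the suicide data:

```python
from collections import defaultdict


def ols(xs, ys):
    """Least-squares (intercept, slope) for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up illustrative points (group, x, y); not taken from the dataset.
data = [("A", 1, 2), ("A", 2, 4), ("A", 3, 6),
        ("B", 1, 5), ("B", 2, 5), ("B", 3, 5)]


def fit_by_group(data):
    """One fit per group, as when the grouping aesthetic is inherited."""
    groups = defaultdict(list)
    for g, x, y in data:
        groups[g].append((x, y))
    return {g: ols(*zip(*pts)) for g, pts in groups.items()}


def fit_overall(data):
    """A single fit that ignores the grouping variable."""
    return ols([x for _, x, _ in data], [y for _, _, y in data])


per_group = fit_by_group(data)
overall = fit_overall(data)
print(per_group)
print(overall)
```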

What is a geom?

  • geom_point and geom_smooth are both geometrical elements (“geoms”) used to represent data
  • here both take the same x and y variables and use them to produce different visualisations
  • other geom examples are geom_line, geom_histogram, geom_violin, geom_rect
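A geom can be thought of as a function from the same mapped data to different drawing primitives. The sketch below illustrates that idea in plain Python; `point_geom`, `line_geom` and `col_geom` are hypothetical stand-ins, not library functions, and the bar width of 0.9 is an assumed value.

```python
# Three (x, y) data points used for illustration.
data = [{"x": 1, "y": 2}, {"x": 2, "y": 4}, {"x": 3, "y": 10}]


def point_geom(rows):
    """One marker per row."""
    return [("marker", r["x"], r["y"]) for r in rows]


def line_geom(rows):
    """Segments joining consecutive rows after sorting by x."""
    pts = sorted((r["x"], r["y"]) for r in rows)
    return [("segment", a, b) for a, b in zip(pts, pts[1:])]


def col_geom(rows, width=0.9):
    """One rectangle per row, rising from 0 to y."""
    half = width / 2
    return [("rect", r["x"] - half, 0, r["x"] + half, r["y"]) for r in rows]


for geom in (point_geom, line_geom, col_geom):
    print(geom.__name__, geom(data))
```

Swapping one geom for another leaves the data and the mapping untouched, which is why changing plot type is a one-word edit.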

Simple geom example

x y label
1 2 a
2 4 b
3 10 c

point

ggplot(data_df, aes(x, y, label = label)) +
  theme(text=element_text(size=14)) +
  geom_point()

text

ggplot(data_df, aes(x, y, label = label)) +
  theme(text=element_text(size=14)) +
  geom_text(size=18)

col

ggplot(data_df, aes(x, y, label = label)) +
  theme(text=element_text(size=14)) +
  geom_col()

line

ggplot(data_df, aes(x, y, label = label)) +
  theme(text=element_text(size=14)) +
  geom_line()

line and points

ggplot(data_df, aes(x, y, label = label)) +
  theme(text=element_text(size=14)) +
  geom_line() + geom_point()

line and jittered points

ggplot(data_df, aes(x, y, label = label)) +
  theme(text=element_text(size=14)) +
  geom_line() + geom_jitter()

regression line and jitter

ggplot(data_df, aes(x, y, label = label)) +
  theme(text=element_text(size=14)) +
  geom_smooth(method="lm", se=FALSE, formula = y~x) + geom_jitter()

polygon

ggplot(data_df, aes(x, y, label = label)) +
  theme(text=element_text(size=14)) +
  geom_polygon()

Mini-breather: any questions?

Order of layering

ggplot(df, aes(x=population, y=suicides_no)) +
  geom_point(alpha=1, aes(colour=country)) +
  geom_smooth(method="lm", se=FALSE, colour="black")

Order of layering: reversed

ggplot(df, aes(x=population, y=suicides_no)) +
  geom_smooth(method="lm", se=FALSE, colour="black") +
  geom_point(alpha=1, aes(colour=country))
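Layers are drawn in the order they are added, so with fully opaque points (alpha=1) whichever layer comes last sits on top. A toy sketch of that painting order in plain Python, using a one-row character canvas; the symbols and positions are invented for illustration:

```python
def draw_layers(layers, width=5):
    """Paint layers in order onto a one-row canvas; later layers overwrite earlier ones."""
    canvas = ["."] * width
    for symbol, positions in layers:
        for pos in positions:
            canvas[pos] = symbol
    return "".join(canvas)


points = ("o", [1, 2, 3])         # stands in for the point layer
smooth = ("-", [0, 1, 2, 3, 4])   # stands in for the regression line

line_on_top = draw_layers([points, smooth])
points_on_top = draw_layers([smooth, points])
print(line_on_top)    # -----
print(points_on_top)  # -ooo-
```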

Change axis scales

ggplot(df, aes(x=population, y=suicides_no)) +
  geom_point(alpha=0.3, aes(colour=country)) + scale_x_sqrt() + scale_y_sqrt()
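A square-root scale positions each point by the square root of its value, which spreads out small values and compresses large ones. The sketch below compares linear and square-root positions in plain Python; it assumes an axis starting at 0 with a hypothetical length of 100 units and ignores the padding a plotting library adds at the axis ends.

```python
import math


def linear_scale(value, domain_max, axis_length=100.0):
    """Position along the axis when values are mapped linearly."""
    return axis_length * value / domain_max


def sqrt_scale(value, domain_max, axis_length=100.0):
    """Position along the axis under a square-root transform."""
    return axis_length * math.sqrt(value) / math.sqrt(domain_max)


# Population values taken from Bahrain 1985 rows in the table.
populations = [1400, 11600, 49700, 67600]
for p in populations:
    print(p, round(linear_scale(p, 67600), 1), round(sqrt_scale(p, 67600), 1))
```

A value at a quarter of the maximum sits a quarter of the way along a linear axis but halfway along a square-root axis.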

Boxplots

ggplot(df, aes(x=as.factor(year), y=suicides_no)) +
  geom_boxplot()
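A boxplot summarises each group with a handful of numbers: the median, the quartiles (the box), whiskers reaching to the most extreme points within 1.5 times the interquartile range of the box, and outliers beyond that. The following plain-Python sketch computes those numbers under that convention; the choice of the "inclusive" quantile method is an assumption and a plotting library may interpolate quartiles slightly differently.

```python
import statistics


def boxplot_stats(values, coef=1.5):
    """Five-number summary plus outliers, using the 1.5 * IQR whisker rule."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - coef * iqr
    upper_fence = q3 + coef * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "lower_whisker": min(inside),
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_whisker": max(inside),
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# suicides_no for the twelve Bahrain 2000 rows in the table at the start of the deck
stats = boxplot_stats([16, 10, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0])
print(stats)
```

With so many zero counts the box collapses towards 0 and the two largest counts show up as outliers, which is the pattern to expect in the per-year boxplots of this dataset.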

Boxplots flipped

ggplot(df, aes(x=as.factor(year), y=suicides_no)) +
  geom_boxplot() +
  coord_flip()

Separate by age group

ggplot(df, aes(x=year, y=suicides_no, colour=country, shape=age)) +
  geom_point(alpha=0.8)

Facet panelling

  • one way to add variables is with aesthetics
  • another way, especially useful for categorical variables, is to split plots into facets
  • each facet represents a plot of a subset of your data
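Conceptually, faceting splits the rows into subsets keyed by the facet variable(s) and draws the same plot once per subset. A minimal sketch of that split in plain Python; the function `facet` is a hypothetical helper, and the rows are a few Bahrain entries from the table.

```python
from collections import defaultdict

rows = [
    {"year": 1985, "age": "25-34 years", "sex": "male", "suicides_no": 7},
    {"year": 1985, "age": "35-54 years", "sex": "male", "suicides_no": 3},
    {"year": 1985, "age": "35-54 years", "sex": "female", "suicides_no": 1},
    {"year": 1987, "age": "25-34 years", "sex": "female", "suicides_no": 1},
]


def facet(rows, *variables):
    """Group rows by the values of the facet variables; one entry per panel."""
    panels = defaultdict(list)
    for row in rows:
        key = tuple(row[v] for v in variables)
        panels[key].append(row)
    return dict(panels)


by_age = facet(rows, "age")                 # one panel per age group
by_age_and_sex = facet(rows, "age", "sex")  # one panel per (age, sex) cell

for key, panel in by_age_and_sex.items():
    print(key, [r["suicides_no"] for r in panel])
```

Every row lands in exactly one panel, so faceting adds a variable to the display without changing the mapping used inside each panel.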

Facet by country

ggplot(df, aes(x=year, y=suicides_no, shape=age)) +
  geom_point() + facet_wrap(~country)

Separate panel for age and country

ggplot(df, aes(x=year, y=suicides_no)) +
  geom_point() + facet_grid(vars(country), vars(age), scales="free")

Add in sex

ggplot(df, aes(x=year, y=suicides_no, colour=sex)) +
  geom_point() + facet_grid(vars(country), vars(age), scales="free")

Change geom to line

ggplot(df, aes(x=year, y=suicides_no, colour=sex)) +
  geom_line() + facet_grid(vars(country), vars(age), scales="free")

Adding linear regressions

ggplot(df, aes(x=year, y=suicides_no, colour=sex)) +
  geom_line() + facet_grid(vars(country), vars(age), scales="free") +
  geom_smooth(method="lm", se=FALSE)

Conclusions

Benefits of GG

  • agile data exploration: keep tinkering till it looks right
  • aesthetics allow layering of hierarchies of features
  • geoms handle much of the work, so there is less to get wrong
  • dominates traditional graphics for rich datasets

Packages

  • R: ggplot2
  • Python: plotnine (essentially ggplot2) and Plotly

Where to learn more

  • “ggplot2: Elegant Graphics for Data Analysis”, free online book by Hadley Wickham
  • “R for Data Science”, free online book by Hadley Wickham and Garrett Grolemund